Malaria Detection¶
Problem Definition
The context: Why is this problem important to solve?
Traditional malaria diagnosis involves careful inspection of blood samples by experienced professionals to distinguish healthy from infected red
blood cells. It is a tedious, time-consuming job, and its accuracy depends on the human expertise of the person doing the inspection.
In this project, we try to automate this process with an AI/ML-based solution
that can support and enhance the diagnosis of malaria.
The objectives: What is the intended goal?
Our intention in this project is to streamline a diagnosis process that heavily depends on human expertise and naked-eye observation.
The intended goal is to explore opportunities to develop and deploy an AI/ML-based deep learning solution that detects
and classifies whether images of red blood cells collected from patient specimens are infected with the Plasmodium parasite or not. In turn, this
will give professionals a tool to expedite their investigations and save time when screening the large populations
that have faced this disease over the years in underdeveloped parts of the world.
This tool's purpose is to support and enhance diagnostic capabilities and provide an instrument that can expedite the diagnostic process.
The key questions: What are the key questions that need to be answered?
- Can we build an AI/ML-based model that aids detection of the parasite from an image of a red blood cell given as a test specimen?
- Can we classify red blood cell images into infected and uninfected cells?
The problem formulation: What is it that we are trying to solve using data science? As part of the solution we intend to:
- Divide the labelled dataset into training and testing sets (in this case it is already divided).
- Perform EDA on the dataset to explore common behavior, patterns, and attributes that can inform the design of AI/ML models.
- Design various models based on ANNs/CNNs that can be trained and tested against the divided datasets.
- Tune the models toward the outcome that most closely matches our requirements.
- Decide whether the model can be deployed in the environment, and at what capacity one can trust it to help/aid the decision process.
- Plan future training and refinement of the model.
- Assess performance metrics as well as the cost of deploying such a solution (if it is acceptable to deploy).
Data Description
There are a total of 24,958 train and 2,600 test images (colored), taken from microscopic imagery of red blood cells. These images fall into the following categories:
Parasitized: The parasitized cells contain the Plasmodium parasite which causes malaria
Uninfected: The uninfected cells are free of the Plasmodium parasites
Mount the Drive
There is no need to mount a drive: I am using a Jupyter notebook under Anaconda, with the images stored in a local folder, so this step is skipped.
# No need to mount the drive as the script uses local folders to open the images and anaconda environment
# to run the scripts.
Loading libraries¶
# uncomment the line below if cv2 is not accessible - once run, you don't need to do it again.
#!pip install opencv-python opencv-python-headless
import cv2
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import tensorflow as tf
from sklearn.metrics import confusion_matrix
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from typing import Tuple, Iterable
Let us load the data¶
Note:
- You must download the dataset from the link provided on Olympus and upload the same to your Google Drive. Then unzip the folder.
# load the image files into a tensorflow dataset - the folder has
# infected and uninfected labels, which are treated as classes.
# we load the images as 64 x 64 RGB in batches of 100.
# these options can be changed - I initially started with 252 x 252, which
# caused a lot of problems: the kernel reset every time and the CPU could not
# keep up. so make sure the size is optimized for processing.
# it returns a tf.data.Dataset.
def create_image_dataset(
directory: str,
image_size: tuple = (64, 64),
batch_size: int = 100
) -> tf.data.Dataset:
return tf.keras.utils.image_dataset_from_directory(
directory=directory,
label_mode="int",
image_size=image_size,
batch_size=batch_size
)
# the images are extracted locally on the device and are not uploaded to Google Drive,
# so there is no need to mount the drive. The environment used is Anaconda with Jupyter Notebook.
# Create a training dataset - this will be tensorflow dataset object.
train_ds = create_image_dataset(directory='cell_images/train')
# Create a testing dataset of similar type.
test_ds = create_image_dataset(directory='cell_images/test')
Found 24958 files belonging to 2 classes.
Found 2600 files belonging to 2 classes.
The extracted folder has separate train and test folders, each containing images of varying sizes for parasitized and uninfected cells under the respective folder names.
All images must be the same size and converted to 4D arrays so they can be used as input to the convolutional neural network. We also need to create the labels for both types of images to be able to train and test the model.
Let's do this for the training data first, then reuse the same code for the test data.
# convert the dataset into a 4D array for the convolutional network.
# this is done for both the training and the testing dataset.
# strictly speaking this may not be necessary, since a CNN can consume the
# tf.data.Dataset directly, but the arrays are convenient for later processing.
def convert_to_4d_array(dataset: Iterable) -> Tuple[np.ndarray, np.ndarray]:
'''
Convert an image dataset into a 4D array consumable
for CNN modeling.
The dataset yields (image, label) batches that are
extracted into an image list and a label list.
'''
images_list = []
labels_list = []
# extract images and labels from the dataset for processing.
# The OUT_OF_RANGE log message is expected: it is emitted when the iterator
# reaches the end of the dataset, typically at the final (smaller) batch.
# Equal-sized batches would avoid it, but the training set size (24,958) is
# 2 times a prime, so there is no larger factor to use as a batch size.
for images, labels in dataset:
images_list.append(images.numpy())
labels_list.append(labels.numpy())
X = np.concatenate(images_list, axis=0)
y = np.concatenate(labels_list, axis=0)
return X, y
X_train, y_train = convert_to_4d_array(train_ds)
X_test, y_test = convert_to_4d_array(test_ds)
2025-04-13 20:48:32.850692: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence 2025-04-13 20:48:33.301435: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Check the shape of the train images and labels¶
print("Image shape:", X_train.shape)
print("Label shape:", y_train.shape)
Image shape: (24958, 64, 64, 3)
Label shape: (24958,)
Check the shape of the test images and labels¶
print("Image shape:", X_test.shape)
print("Label shape:", y_test.shape)
# the images in the dataset are arranged into batches of 100
# this shows the images in a batch are already 4D
for batch in train_ds.take(1):
images,label = batch
print( "Image Shape on the dataset:", images.shape )
Image shape: (2600, 64, 64, 3)
Label shape: (2600,)
Image Shape on the dataset: (100, 64, 64, 3)
Observations and insights: ¶
Based on the shape attributes, the training dataset has 24,958 images, each 64x64 as loaded by the image-loading API, with 3 channels representing the RGB components.
The testing dataset has 2,600 entries; the image size and channels match the training dataset.
The label arrays are one-dimensional, totalling 24,958 for the training dataset and 2,600 for the testing dataset.
It is important to know these counts and shapes, because we will use them to judge how each model performs in terms of correct and incorrect classifications. The confusion matrix is a simple way to make that comparison; our goal is to reduce the false positive/negative counts in the confusion matrix, and we will look at these numbers closely for each model.
Check the minimum and maximum range of pixel values for train and test images
# It is good to check the minimum and maximum pixel range, though it is not clear
# whether this provides valuable information for classification or for
# distinguishing patterns in the dataset.
def get_min_max_pixel_values( dataset ):
'''get min and max of image'''
min_pixel_value = float( 'inf' )   # start high so any pixel value is smaller
max_pixel_value = float( '-inf' )  # start low so any pixel value is larger
for images, labels in dataset:
min_pixel_value = min( min_pixel_value, tf.reduce_min( images ).numpy() )
max_pixel_value = max( max_pixel_value, tf.reduce_max( images ).numpy() )
return min_pixel_value, max_pixel_value
# Get min and max pixel values for training and test datasets
train_min, train_max = get_min_max_pixel_values( train_ds )
test_min, test_max = get_min_max_pixel_values( test_ds )
print(f"Training dataset - Min pixel value: { train_min }, Max pixel value: { train_max }")
print(f"Test dataset - Min pixel value: { test_min }, Max pixel value: { test_max }")
# Access class names and store it
class_names = train_ds.class_names
print(f"Class names: {class_names}")
Training dataset - Min pixel value: 0.0, Max pixel value: 255.0
Test dataset - Min pixel value: 0.0, Max pixel value: 255.0
Class names: ['parasitized', 'uninfected']
2025-04-13 20:48:34.868879: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Observations and insights:
The max and min are 255 and 0, meaning the pixels span the full RGB range; nothing stands out in the minimum or maximum values for our data exploration. It is still good to know: in some datasets certain pixel colors are prominent, or a specific color is absent entirely, and in those circumstances it is important to notice such patterns and plan or enhance the model to leverage those traits.
Count the number of images in the uninfected and parasitized classes
# Let's write out how many samples we have in each class for the training and
# testing datasets - this way we know what to expect when the model is evaluated.
def count_classes( dataset ):
'''count number of items in each class per dataset'''
class_counts = { 0:0, 1:0 }
for images, labels in dataset:
for label in labels.numpy():
class_counts[label] +=1
return class_counts
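As an aside, the same per-class tally can be done with `collections.Counter`. This is a minimal, self-contained sketch: plain Python lists stand in for the integer label batches that the `tf.data` pipeline yields.

```python
from collections import Counter

# Stand-in for the per-batch integer labels (0 = parasitized, 1 = uninfected)
# that iterating over the dataset would produce.
batches = [[0, 1, 1, 0], [1, 1, 0], [0, 0]]

counts = Counter()
for labels in batches:   # one Counter update per batch
    counts.update(labels)

print(dict(counts))  # {0: 5, 1: 4}
```

In the notebook, `counts.update(labels.numpy())` inside the dataset loop would give the same result as the hand-rolled dictionary above.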
train_class_counts = count_classes( train_ds )
test_class_counts = count_classes( test_ds )
print( f"total count in training set: {class_names[0]}: {train_class_counts[0]} and {class_names[1]}: {train_class_counts[1]}" )
print( f"total count in test set: {class_names[0]}: {test_class_counts[0]} and {class_names[1]}: {test_class_counts[1]}" )
total count in training set: parasitized: 12582 and uninfected: 12376
total count in test set: parasitized: 1300 and uninfected: 1300
Normalize the images
# for a CNN the image data should be normalized, i.e. brought into the 0-1 range.
# the approach is similar to the elective project - that one used monochrome
# images while here we have RGB, but that makes no difference for normalization -
# so we follow the same pattern of dividing each pixel by 255.
X_train_normalized = X_train / 255.0
X_test_normalized = X_test / 255.0
def normalize_image(image, label):
image = tf.cast(image, tf.float32) / 255.0
return image, label
# normalize the training and data-set we are going to use this for the rest of the modeling.
train_ds_normalized = train_ds.map(normalize_image)
test_ds_normalized = test_ds.map(normalize_image)
print("Normalized image shape Training:", X_train_normalized.shape) # (num_samples, height, width, channels)
print("Normalized image shape Test:", X_test_normalized.shape) # (num_samples, height, width, channels)
Normalized image shape Training: (24958, 64, 64, 3)
Normalized image shape Test: (2600, 64, 64, 3)
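A quick sanity check on this step: dividing by 255 should land every pixel in [0, 1]. A small random uint8-style array stands in for `X_train` so the sketch is self-contained.

```python
import numpy as np

# Stand-in for X_train: random integer pixel values in the 0-255 range.
X = np.random.randint(0, 256, size=(4, 64, 64, 3)).astype(np.float32)
X_normalized = X / 255.0

# After normalization, all values should fall within [0, 1].
print(X_normalized.min() >= 0.0 and X_normalized.max() <= 1.0)  # True
```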
Observations and insights:¶
There are 24,958 images in the training set and 2,600 images in the testing set. They are loaded at 64 x 64 size with 3 channels representing the RGB colors.
Plot to check if the data is balanced
def show_bar_plot( dataset, name='training' ):
''' show bar plot for the data set'''
count_class = count_classes( dataset )
# Plotting the class distribution for training set
sns.barplot(x=list(count_class.keys()), y=list(count_class.values()))
plt.title( f'Class Distribution in {name} Dataset')
plt.xlabel('Class')
plt.ylabel('Number of Images')
plt.show()
show_bar_plot( train_ds )
2025-04-13 20:48:38.241864: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
show_bar_plot( test_ds , name='Test' )
Observations and insights:
The plots of the training and testing class counts show a proper balance of samples. No class-imbalance problem is expected moving forward.
Data Exploration¶
Let's visualize the images from the train data
def show_3_by_3_image_of_dataset( dataset ):
''' show 3x3 image matrix for the dataset '''
images, labels = next(iter(dataset))
# Plot a few images from the batch
plt.figure(figsize=(10, 10))
for i in range(9):
ax = plt.subplot(3, 3, i + 1)
plt.imshow(images[i].numpy().astype("uint8"))
plt.title(class_names[labels[i].numpy()])
plt.axis("off")
plt.show()
show_3_by_3_image_of_dataset( train_ds )
Observations and insights:¶
One clear distinction is that the uninfected cells look clear, while the infected ones show visible patterns inside the cell. I believe these patterns can be learned by the model to better predict the classification. It also appears that most uninfected cells do not show impurities such as pink circles or arches.
Visualize the images with subplot(6, 6) and figsize = (12, 12)
def show_6_by_6_image_of_dataset( dataset ):
'''show 6x6 regular rgb image'''
# Get one batch of images and labels
images, labels = next(iter(dataset))
plt.figure(figsize=(12, 12))
for i in range(36):
ax = plt.subplot(6, 6, i + 1)
plt.imshow(images[i].numpy().astype("uint8"))
plt.title(class_names[labels[i].numpy()])
plt.axis("off")
plt.show()
show_6_by_6_image_of_dataset( train_ds )
Observations and insights:
Looking at the images in the 3 x 3 and 6 x 6 grids, there is no easy way to tell the uninfected and parasitized images apart at a glance, but there are definitely visible dots in the parasitized images, which is a clear difference from the uninfected cells.
A model should be able to pick up on these dots in the cells, improving both the model and our ability to classify correctly.
Plotting the mean images for parasitized and uninfected
# there should be a better way to get the average - see if we can find one.
# for now, compute the per-class sum and then divide by the per-class count.
parasitized_sum = np.zeros((64, 64, 3), dtype=np.float32)
uninfected_sum = np.zeros((64, 64, 3), dtype=np.float32)
parasitized_count = 0
uninfected_count = 0
for images, labels in train_ds:
for i in range(len(labels)):
if labels[i].numpy() == 1: # Uninfected class
uninfected_sum += images[i].numpy()
uninfected_count += 1
elif labels[i].numpy() == 0: # Parasitized class
parasitized_sum += images[i].numpy()
parasitized_count += 1
mean_parasitized_image = parasitized_sum / parasitized_count
mean_uninfected_image = uninfected_sum / uninfected_count
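The comment above asks for a simpler way to get the average. With the images already stacked into a 4D array (like `X_train`) and integer labels (like `y_train`), a boolean mask plus a mean over the sample axis gives the per-class mean image in one line. Small random stand-ins are used here so the sketch is self-contained.

```python
import numpy as np

# Stand-ins for X_train / y_train (class 0 = parasitized, 1 = uninfected).
X = np.random.randint(0, 256, size=(10, 64, 64, 3)).astype(np.float32)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Boolean masks select each class; mean over axis 0 averages the images.
mean_parasitized = X[y == 0].mean(axis=0)
mean_uninfected = X[y == 1].mean(axis=0)

print(mean_parasitized.shape)  # (64, 64, 3)
```

This is numerically equivalent to the sum-and-divide loop above, just vectorized.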
Mean image for parasitized
# Plot the mean images
plt.figure(figsize=(12, 6))
# Plot the mean image for the parasitized class
plt.subplot(1, 2, 1)
plt.imshow(mean_parasitized_image.astype("uint8"))
plt.title("Mean Image - Parasitized")
plt.axis("off")
(-0.5, 63.5, 63.5, -0.5)
Mean image for uninfected
# Plot the mean images
plt.figure(figsize=(12, 6))
# Plot the mean image for the uninfected class
plt.subplot(1, 2, 2)
plt.imshow(mean_uninfected_image.astype("uint8"))
plt.title("Mean Image - Uninfected")
plt.axis("off")
plt.show()
Observations and insights:
The mean parasitized and uninfected images do not provide any way to differentiate the classes that could easily guide classification; the regular images are more informative than the averages.
The mean image is very blurry - it does not even convey the outer shape, and the borders show the texture of individual pixels.
Converting RGB to HSV of Images using OpenCV
Converting the train data
# define a converter function that will act as a map to do rgb->hsv
def rgb_to_hsv_batch(images, labels):
images_hsv = tf.image.rgb_to_hsv(tf.cast(images, tf.float32) / 255.0) # Normalize to [0, 1]
return images_hsv, labels
# convert
train_ds_hsv = train_ds.map(rgb_to_hsv_batch)
def show_6_x_6_hsv_images( dataset ):
'''show 6x6 matrix of HSV images'''
# Visualize a batch from the HSV dataset
# Get a batch of images
images_hsv, labels = next(iter(dataset))
plt.figure(figsize=(12, 12))
for i in range(36):
ax = plt.subplot(6, 6, i + 1)
plt.imshow(images_hsv[i, :, :, 0], cmap='hsv')
plt.title(class_names[labels[i].numpy()])
plt.axis("off")
plt.show()
show_6_x_6_hsv_images( train_ds_hsv )
Converting the test data
# Map the conversion function over the dataset
test_ds_hsv = test_ds.map(rgb_to_hsv_batch)
# show HSV images mainly 6x6 matrix
show_6_x_6_hsv_images( test_ds_hsv )
Observations and insights:¶
The HSV images are more confusing and do not provide a clear separation to our eyes; the RGB images show a clearer difference. The HSV images typically show a lot of red and some blue, with no clear indicator that could be used. There are also some mixed cases - for example, similar-looking blue cells appear as both infected and uninfected. If this confusion is visible even to the naked eye, the representation may not help the modeling.
In my opinion, RGB images are still the better choice for modeling.
Processing Images using Gaussian Blurring
# a function that converts regular RGB images into blurred images using OpenCV's
# GaussianBlur method.
# blurring here takes the normalized images - passing the raw images throws an error.
def convert_to_blurred_images( data_set ):
'''converts a dataset of normalized images into blurred images (first 100 only, enough for visualization)'''
imgs = []
for i in range(100):
b = cv2.GaussianBlur(data_set[i], (5, 5), 0)
imgs.append(b)
return np.array(imgs)
# convert the normalized train data set into blurred image
train_blurred_imgs = convert_to_blurred_images( X_train_normalized )
# convert the test data set normalized image into blurred image
test_blurred_imgs = convert_to_blurred_images( X_test_normalized )
Gaussian Blurring on train data
def show_6_x_6_blurred_images( blurred_imgs, labels=None ):
'''show a 6x6 grid of blurred images; pass integer labels (e.g. y_train)
to title each image with its class name.'''
plt.figure(figsize=(12, 12))
for i in range(36):
plt.subplot(6, 6, i+1)
plt.imshow(blurred_imgs[i])
# titles are optional - the label array must line up with blurred_imgs
plt.title(class_names[labels[i]] if labels is not None else '')
plt.axis('off')
plt.show()
# show training images - that are blurred.
show_6_x_6_blurred_images( train_blurred_imgs )
Gaussian Blurring on test data
# show the testing blurred images
show_6_x_6_blurred_images( test_blurred_imgs )
Observations and insights:
The blurred images also do not provide better class separation than the RGB images.
Think About It: Would blurring help us for this problem statement in any way? What else can we try?
No, blurring does not provide any added advantage in my opinion. It can certainly help in some cases, but this does not appear to be one of them. There are other things we could try, such as creating monochrome images or other preprocessing, but I am not sure any of it would make a major difference, so it is better to start experimenting with models and examine their output.
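The note above mentions monochrome images as one option. A minimal, self-contained sketch of an RGB-to-grayscale conversion using the standard luminance weights; in the notebook this could be applied to `X_train_normalized` the same way the Gaussian blur was, and random stand-in data is used here.

```python
import numpy as np

def to_grayscale(images: np.ndarray) -> np.ndarray:
    '''Collapse the RGB channel axis using luminance weights 0.299/0.587/0.114.'''
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = images @ weights          # weighted sum over channels -> (N, H, W)
    return gray[..., np.newaxis]     # keep a channel axis -> (N, H, W, 1)

# Stand-in for a batch of normalized images.
imgs = np.random.rand(5, 64, 64, 3).astype(np.float32)
gray_imgs = to_grayscale(imgs)
print(gray_imgs.shape)  # (5, 64, 64, 1)
```

Whether grayscale helps is an open question; it discards the color cues (pink/purple staining) that the RGB observations above relied on.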
Model Building¶
Base Model: Model 1¶
Note: The Base Model has been fully built and evaluated with all outputs shown to give an idea about the process of the creation and evaluation of the performance of a CNN architecture. A similar process can be followed in iterating to build better-performing CNN architectures.
Importing the required libraries for building and training our Model
# already imported required libraries. so it is good.
One Hot Encoding the train and test labels
# Function to one-hot encode the labels
def one_hot_encode(labels, num_classes):
'''a mapping function for the one hot encoding'''
return tf.one_hot(labels, num_classes)
num_classes = 2
# Function to apply one-hot encoding to each batch in the dataset
def one_hot_encode_batch(images, labels):
'''apply one hot encoding to the batch using one_hot_encode'''
labels_one_hot = one_hot_encode(labels, num_classes)
return images, labels_one_hot
# map the one-hot encoding function over the normalized datasets.
train_ds_one_hot = train_ds_normalized.map(one_hot_encode_batch)
test_ds_one_hot = test_ds_normalized.map(one_hot_encode_batch)
type( train_ds_one_hot )
tensorflow.python.data.ops.map_op._MapDataset
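To make the one-hot mapping concrete, here is a tiny NumPy illustration (independent of TensorFlow) of what `tf.one_hot` produces for integer labels with two classes: each label becomes a two-element indicator row.

```python
import numpy as np

# Integer class labels as produced by image_dataset_from_directory.
labels = np.array([0, 1, 1, 0])

# np.eye(2)[labels] mirrors tf.one_hot(labels, 2): row i is the
# indicator vector for labels[i].
one_hot = np.eye(2)[labels]
print(one_hot)
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]
```

This is why the output layer below has two units with softmax, paired with a categorical cross-entropy loss.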
Building the model¶
def create_base_cnn_model( num_classes = 2 ):
'''a base CNN model - we will explore other models built on this.'''
model = models.Sequential([
# Convolutional Layer 1
tf.keras.layers.Input(shape=(64, 64, 3)),
layers.Conv2D(32, (3, 3), activation='relu' ),
layers.MaxPooling2D((2, 2)),
# Convolutional Layer 2
layers.Conv2D(64, (3, 3), activation='relu'),
layers.MaxPooling2D((2, 2)),
# Convolutional Layer 3
layers.Conv2D(64, (3, 3), activation='relu'),
# Flattening Layer
layers.Flatten(),
# Fully Connected Layer
layers.Dense(64, activation='relu'),
# Output Layer (softmax for classification)
layers.Dense(num_classes, activation='softmax')
])
return model
Compiling the model
# CNN based basic model (64x64 images, 2 classes)
cnn_base_model = create_base_cnn_model()
cnn_base_model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
cnn_base_model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d (Conv2D) │ (None, 62, 62, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d (MaxPooling2D) │ (None, 31, 31, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_1 (Conv2D) │ (None, 29, 29, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_1 (MaxPooling2D) │ (None, 14, 14, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_2 (Conv2D) │ (None, 12, 12, 64) │ 36,928 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ flatten (Flatten) │ (None, 9216) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense (Dense) │ (None, 64) │ 589,888 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 2) │ 130 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 646,338 (2.47 MB)
Trainable params: 646,338 (2.47 MB)
Non-trainable params: 0 (0.00 B)
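As a quick check on the summary above, the parameter counts can be reproduced by hand: a Conv2D layer has (kernel_h * kernel_w * in_channels + 1) * filters parameters (the +1 is the bias), and a Dense layer has (inputs + 1) * units.

```python
# Parameter counts for each layer of the base model, matching the summary.
conv1 = (3 * 3 * 3 + 1) * 32       # 896
conv2 = (3 * 3 * 32 + 1) * 64      # 18,496
conv3 = (3 * 3 * 64 + 1) * 64      # 36,928
dense1 = (9216 + 1) * 64           # 589,888 (flatten: 12 * 12 * 64 = 9216)
dense2 = (64 + 1) * 2              # 130
total = conv1 + conv2 + conv3 + dense1 + dense2
print(total)  # 646338
```

The total agrees with the 646,338 trainable parameters reported by `summary()`, almost all of which sit in the first dense layer.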
Using Callbacks
# we will use these callbacks to keep tabs on training - possibly only for
# specific advanced models. EarlyStopping halts training when val_loss stops
# improving within the patience window, which tells us the model is unlikely
# to reach better values. ModelCheckpoint saves the best weights in case we
# want to reuse or monitor them.
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model_checkpoint = ModelCheckpoint('.mdl_wts.keras', monitor='val_loss', save_best_only=True)
Fit and train our Model
# let's fit the basic model for up to 20 epochs. the dataset is already batched
# (100 images per batch), so no batch_size argument is passed to fit().
cnn_base_model_fit = cnn_base_model.fit(train_ds_one_hot, epochs=20, validation_data=test_ds_one_hot, callbacks=[early_stopping, model_checkpoint])
cnn_base_model_fit.history
Epoch 1/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 19s 73ms/step - accuracy: 0.6080 - loss: 0.6520 - val_accuracy: 0.8873 - val_loss: 0.3210
Epoch 2/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 18s 73ms/step - accuracy: 0.9105 - loss: 0.2213 - val_accuracy: 0.9396 - val_loss: 0.1496
Epoch 3/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 19s 74ms/step - accuracy: 0.9592 - loss: 0.1273 - val_accuracy: 0.9569 - val_loss: 0.1252
Epoch 4/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 19s 76ms/step - accuracy: 0.9686 - loss: 0.1059 - val_accuracy: 0.9727 - val_loss: 0.1016
Epoch 5/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 20s 81ms/step - accuracy: 0.9743 - loss: 0.0861 - val_accuracy: 0.9796 - val_loss: 0.0908
Epoch 6/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 84ms/step - accuracy: 0.9771 - loss: 0.0709 - val_accuracy: 0.9819 - val_loss: 0.0588
Epoch 7/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 86ms/step - accuracy: 0.9794 - loss: 0.0617 - val_accuracy: 0.9835 - val_loss: 0.0548
Epoch 8/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 85ms/step - accuracy: 0.9816 - loss: 0.0528 - val_accuracy: 0.9808 - val_loss: 0.0540
Epoch 9/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 85ms/step - accuracy: 0.9839 - loss: 0.0453 - val_accuracy: 0.9804 - val_loss: 0.0574
Epoch 10/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 85ms/step - accuracy: 0.9864 - loss: 0.0397 - val_accuracy: 0.9819 - val_loss: 0.0597
Epoch 11/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 86ms/step - accuracy: 0.9890 - loss: 0.0333 - val_accuracy: 0.9781 - val_loss: 0.0788
Epoch 12/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 83ms/step - accuracy: 0.9907 - loss: 0.0288 - val_accuracy: 0.9792 - val_loss: 0.0704
Epoch 13/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 22s 87ms/step - accuracy: 0.9921 - loss: 0.0242 - val_accuracy: 0.9762 - val_loss: 0.0837
{'accuracy': [0.6831076145172119,
0.9302027225494385,
0.9631781578063965,
0.9717926383018494,
0.9779229164123535,
0.9786040782928467,
0.9799663424491882,
0.9834922552108765,
0.9848545789718628,
0.987939715385437,
0.9901434183120728,
0.9922269582748413,
0.9935491681098938],
'loss': [0.5813901424407959,
0.17600655555725098,
0.11672408133745193,
0.09504634141921997,
0.07482770085334778,
0.06146879494190216,
0.05700996518135071,
0.047179900109767914,
0.04135500267148018,
0.034457284957170486,
0.029271820560097694,
0.023525379598140717,
0.01986163668334484],
'val_accuracy': [0.8873077034950256,
0.9396153688430786,
0.9569230675697327,
0.9726923108100891,
0.9796153903007507,
0.9819231033325195,
0.9834615588188171,
0.9807692170143127,
0.9803845882415771,
0.9819231033325195,
0.9780769348144531,
0.9792307615280151,
0.9761538505554199],
'val_loss': [0.32096830010414124,
0.149576798081398,
0.1252390742301941,
0.10157105326652527,
0.09077686816453934,
0.0588226318359375,
0.054756857454776764,
0.05401620641350746,
0.05739428102970123,
0.059661153703927994,
0.07884502410888672,
0.07038698345422745,
0.08372274786233902]}
Evaluating the model on test data
def evaluate_and_print_accuracy( model, dataset=train_ds_one_hot ):
'''
Evaluate the model on the given dataset (the one-hot encoded training
set by default) and on the one-hot encoded test set, printing the
accuracy and loss of each.
'''
# evaluate the given (training) dataset with the model
eval_train = model.evaluate(dataset)
# evaluate the test dataset
eval_test = model.evaluate( test_ds_one_hot )
print( f"Training Accuracy: {eval_train[1]}, Loss: {eval_train[0]}" )
print( f"Validation Accuracy: {eval_test[1]}, Loss: {eval_test[0]}" )
return eval_train, eval_test
model_list = []
e_train,e_test = evaluate_and_print_accuracy(cnn_base_model)
model_list.append( ("Model 1: cnn_base_model",e_train,e_test) )
len(model_list)
250/250 ━━━━━━━━━━━━━━━━━━━━ 7s 28ms/step - accuracy: 0.9826 - loss: 0.0479
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 30ms/step - accuracy: 0.9838 - loss: 0.0488
Training Accuracy: 0.985575795173645, Loss: 0.04104594141244888
Validation Accuracy: 0.9807692170143127, Loss: 0.05401621386408806
1
def show_models_so_far():
print(model_list[0])
accu_max = 0
max_tuple = None
for model in model_list:
print(f"{model[0]}: test-accuracy:{model[2][1]}")
if( accu_max < model[2][1]):
accu_max = model[2][1]
max_tuple = model
print("\n")
print(f"The best model: {max_tuple[0]}, with test-accuracy: {accu_max}")
show_models_so_far()
('Model 1: cnn_base_model', [0.04104594141244888, 0.985575795173645], [0.05401621386408806, 0.9807692170143127])
Model 1: cnn_base_model: test-accuracy:0.9807692170143127
The best model: Model 1: cnn_base_model, with test-accuracy: 0.9807692170143127
Plotting the confusion matrix
# Define the function to get true and predicted labels from the dataset
def get_true_and_pred_labels(dataset, model):
'''
A helper that runs the model's predictions to get predicted labels and
compares them against the true labels.
'''
y_true = []
y_pred = []
for images, labels in dataset:
# lets run the model for its prediction - see how it performs on dataset
predictions = model.predict(images)
#convert to class labels
predicted_classes = np.argmax(predictions, axis=1)
# Get the true labels (if one-hot encoded, convert them to class labels)
true_classes = np.argmax(labels, axis=1)
y_true.extend(true_classes)
y_pred.extend(predicted_classes)
return np.array(y_true), np.array(y_pred)
def plot_confusion_matrix( cm ):
'''
plot the confusion matrix mainly generated using the confusionMatrix api once the model fitting and evaluation is done.
'''
plt.figure(figsize=(8, 6))
sns.heatmap( cm, annot=True, fmt="d", cmap="Blues", xticklabels=class_names, yticklabels=class_names )
plt.title('Confusion Matrix')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.show()
# Get true and predicted labels from the test dataset
y_true, y_pred = get_true_and_pred_labels(test_ds_one_hot, cnn_base_model)
# Compute confusion matrix of true label to predicted label
cm = confusion_matrix(y_true, y_pred)
# plot the confusion matrix
plot_confusion_matrix(cm)
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 18ms/step (progress line repeated for each of the 26 test batches)
Plotting the train and validation curves
def plot_training_and_validation_graph( model_history ):
'''
A utility function that will take model_history an output of model fitting to plot the validation and training graphs for
loss as well accuracy respectively.
'''
# Plot Training & Validation Loss
plt.figure(figsize=(12, 6))
# Plot Training loss
plt.subplot(1, 2, 1)
plt.plot(model_history.history['loss'], label='Training Loss')
plt.plot(model_history.history['val_loss'], label='Validation Loss')
plt.title('Training vs Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
# plot the validation graph
plt.subplot(1, 2, 2)
plt.plot(model_history.history['accuracy'], label='Training Accuracy')
plt.plot(model_history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Training vs Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
# plot the graph
plt.tight_layout()
plt.show()
plot_training_and_validation_graph( cnn_base_model_fit )
Performance of Base Model: Model 1
- computation speed : 22s per epoch
- Training Accuracy : 0.9826, Loss: 0.0479
- Validation Accuracy : 0.9838, Loss: 0.0488
- False Positive : 19
- False Negative : 31
- Total errors : 50 out of 2600
- Results : A promising start for the base model - especially since the accuracy on the test and validation data comes very close.
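The error counts above can be read straight off the 2x2 confusion matrix. A minimal NumPy sketch, using the counts reported above and assuming rows hold true labels, columns hold predictions, and "parasitized" is the positive class (the class ordering here is illustrative, not taken from the notebook's plot):

```python
import numpy as np

# Hypothetical 2x2 confusion matrix: rows = true labels, columns = predicted
# labels, class order [parasitized, uninfected]. Counts chosen to match the
# totals reported above (50 errors out of 2600).
cm = np.array([[1269, 31],
               [19, 1281]])

tp = cm[0, 0]  # parasitized correctly flagged
fn = cm[0, 1]  # parasitized missed       -> false negatives
fp = cm[1, 0]  # uninfected flagged       -> false positives
tn = cm[1, 1]  # uninfected correctly cleared

total_errors = int(fp + fn)
accuracy = (tp + tn) / cm.sum()
print(total_errors, round(float(accuracy), 4))
```

Swapping the assumed class order swaps which off-diagonal cell counts as false positive versus false negative, so it is worth double-checking against the axis labels on the plotted matrix.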
So now let's build another model with a few additional layers and altered activation functions, and check whether we can improve on the base model.
Model 2
Trying to improve the performance of our model by adding new layers¶
# Improved base Model with more layers and advanced activation functions
def create_advanced_cnn_model(num_classes=2):
'''
Improve on the base model:
- change some activations to LeakyReLU and Swish
- add global average pooling
- add a few more layers
'''
model = models.Sequential([
# input size
tf.keras.layers.Input(shape=(64, 64, 3)),
# Layer 1 with LeakyReLU and max pooling of 2x2
layers.Conv2D(32, (3, 3), activation='linear', padding='same'),
layers.LeakyReLU(alpha=0.1), # LeakyReLU activation
layers.MaxPooling2D((2, 2)),
# Layer 2 with Swish activation and max pooling 2x2
layers.Conv2D(64, (3, 3), activation='linear', padding='same'),
layers.Activation(tf.keras.activations.swish),
layers.MaxPooling2D((2, 2)),
# Layer 3 with LeakyReLU and alpha 0.1
layers.Conv2D(128, (3, 3), activation='linear', padding='same'),
layers.LeakyReLU(alpha=0.1),
# Adding a global average pooling
layers.GlobalAveragePooling2D(),
# Add a dense layer
layers.Dense(128, activation='relu'),
# add a drop out layer
layers.Dropout(0.5), # Dropout to reduce overfitting
# final layer with softmax activation
layers.Dense(num_classes, activation='softmax')
])
return model
Building the Model
# create cnn_model_2 - a slightly more advanced model than the base model
cnn_model_2 = create_advanced_cnn_model(num_classes=2)
/opt/anaconda3/lib/python3.12/site-packages/keras/src/layers/activations/leaky_relu.py:41: UserWarning: Argument `alpha` is deprecated. Use `negative_slope` instead. warnings.warn(
Compiling the model
# Compile the model with Adam optimizer and categorical crossentropy loss
cnn_model_2.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
# Summary of the model
cnn_model_2.summary()
Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ conv2d_3 (Conv2D) │ (None, 64, 64, 32) │ 896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ leaky_re_lu (LeakyReLU) │ (None, 64, 64, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_2 (MaxPooling2D) │ (None, 32, 32, 32) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_4 (Conv2D) │ (None, 32, 32, 64) │ 18,496 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ activation (Activation) │ (None, 32, 32, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ max_pooling2d_3 (MaxPooling2D) │ (None, 16, 16, 64) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ conv2d_5 (Conv2D) │ (None, 16, 16, 128) │ 73,856 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ leaky_re_lu_1 (LeakyReLU) │ (None, 16, 16, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ global_average_pooling2d │ (None, 128) │ 0 │ │ (GlobalAveragePooling2D) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 128) │ 16,512 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (Dropout) │ (None, 128) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (Dense) │ (None, 2) │ 258 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 110,018 (429.76 KB)
Trainable params: 110,018 (429.76 KB)
Non-trainable params: 0 (0.00 B)
Using Callbacks
# we are going to reuse the same callbacks (early_stopping, model_checkpoint) - not defining any new ones.
Fit and Train the model
# fit the model - in batch sizes.
cnn_model_2_hist = cnn_model_2.fit(train_ds_one_hot, batch_size=1000, epochs=20, validation_data= test_ds_one_hot, callbacks=[early_stopping,model_checkpoint])
cnn_model_2_hist.history
Epoch 1/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 136ms/step - accuracy: 0.5938 - loss: 0.6581 - val_accuracy: 0.6846 - val_loss: 0.5801 Epoch 2/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 136ms/step - accuracy: 0.7950 - loss: 0.4467 - val_accuracy: 0.9246 - val_loss: 0.1834 Epoch 3/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 137ms/step - accuracy: 0.9389 - loss: 0.1836 - val_accuracy: 0.9477 - val_loss: 0.1406 Epoch 4/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 137ms/step - accuracy: 0.9498 - loss: 0.1509 - val_accuracy: 0.9492 - val_loss: 0.1436 Epoch 5/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 134ms/step - accuracy: 0.9548 - loss: 0.1323 - val_accuracy: 0.9631 - val_loss: 0.1133 Epoch 6/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 135ms/step - accuracy: 0.9558 - loss: 0.1310 - val_accuracy: 0.9627 - val_loss: 0.1257 Epoch 7/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 138ms/step - accuracy: 0.9605 - loss: 0.1217 - val_accuracy: 0.9662 - val_loss: 0.1054 Epoch 8/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 139ms/step - accuracy: 0.9641 - loss: 0.1107 - val_accuracy: 0.9665 - val_loss: 0.1076 Epoch 9/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 34s 135ms/step - accuracy: 0.9634 - loss: 0.1154 - val_accuracy: 0.9677 - val_loss: 0.1041 Epoch 10/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 140ms/step - accuracy: 0.9667 - loss: 0.1000 - val_accuracy: 0.9727 - val_loss: 0.0890 Epoch 11/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 141ms/step - accuracy: 0.9681 - loss: 0.0918 - val_accuracy: 0.9765 - val_loss: 0.0772 Epoch 12/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 140ms/step - accuracy: 0.9708 - loss: 0.0885 - val_accuracy: 0.9769 - val_loss: 0.0972 Epoch 13/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 141ms/step - accuracy: 0.9730 - loss: 0.0875 - val_accuracy: 0.9762 - val_loss: 0.0852 Epoch 14/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 140ms/step - accuracy: 0.9737 - loss: 0.0818 - val_accuracy: 0.9785 - val_loss: 0.0832 Epoch 15/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 35s 140ms/step - accuracy: 0.9741 - loss: 0.0835 - val_accuracy: 0.9804 - val_loss: 0.0780 Epoch 16/20 250/250 
━━━━━━━━━━━━━━━━━━━━ 36s 142ms/step - accuracy: 0.9738 - loss: 0.0810 - val_accuracy: 0.9777 - val_loss: 0.0814
{'accuracy': [0.6520554423332214,
0.8631701469421387,
0.943905770778656,
0.952880859375,
0.9576889276504517,
0.9578892588615417,
0.9622566103935242,
0.9638592600822449,
0.9663835167884827,
0.9693885445594788,
0.9701498746871948,
0.9722333550453186,
0.9740363955497742,
0.9741966724395752,
0.9743569493293762,
0.9752784967422485],
'loss': [0.624094545841217,
0.3308478593826294,
0.16857729852199554,
0.1441171020269394,
0.12621861696243286,
0.12333261221647263,
0.11484644562005997,
0.11001955717802048,
0.10514771193265915,
0.09214608371257782,
0.08986218273639679,
0.08342592418193817,
0.08477555960416794,
0.07910177856683731,
0.08064153045415878,
0.07524015754461288],
'val_accuracy': [0.6846153736114502,
0.9246153831481934,
0.947692334651947,
0.9492307901382446,
0.9630769491195679,
0.9626923203468323,
0.9661538600921631,
0.9665384888648987,
0.9676923155784607,
0.9726923108100891,
0.9765384793281555,
0.9769230484962463,
0.9761538505554199,
0.9784615635871887,
0.9803845882415771,
0.9776923060417175],
'val_loss': [0.580139696598053,
0.18341359496116638,
0.14055338501930237,
0.14358046650886536,
0.11326293647289276,
0.12568040192127228,
0.10540124773979187,
0.10763168334960938,
0.10413549840450287,
0.08899719268083572,
0.07719162106513977,
0.09718514233827591,
0.08521749824285507,
0.08324695378541946,
0.07798149436712265,
0.08142384141683578]}
Evaluating the model
# evaluate accuracy and loss on the training and test datasets for model 2 - key attributes for comparing models.
e_train,e_test = evaluate_and_print_accuracy( cnn_model_2 )
model_list.append( ("Model 2: cnn_model_2",e_train,e_test) )
len(model_list)
# show the model list so far.
show_models_so_far()
250/250 ━━━━━━━━━━━━━━━━━━━━ 12s 47ms/step - accuracy: 0.9718 - loss: 0.0839 26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 45ms/step - accuracy: 0.9776 - loss: 0.0746 Training Accuracy: 0.9732751250267029, Loss: 0.07916884124279022 Validation Accuracy: 0.9765384793281555, Loss: 0.07719162106513977 ('Model 1: cnn_base_model', [0.04104594141244888, 0.985575795173645], [0.05401621386408806, 0.9807692170143127]) Model 1: cnn_base_model: test-accuracy:0.9807692170143127 Model 2: cnn_model_2: test-accuracy:0.9765384793281555 The best model: Model 1: cnn_base_model, with test-accuracy: 0.9807692170143127
Plotting the confusion matrix
Plotting the train and the validation curves
# let's predict and compute the confusion matrix
y_true, y_pred = get_true_and_pred_labels(test_ds_one_hot, cnn_model_2)
cm = confusion_matrix(y_true, y_pred)
plot_confusion_matrix( cm )
# plot training and validation graph for model 2.
plot_training_and_validation_graph( cnn_model_2_hist )
Observations: Model 2
This model trains successfully in terms of both accuracy and loss, and it runs the full cycle of 20 epochs, which trains the model more thoroughly and steadily improves accuracy and loss.
The confusion matrix shows promising error counts, but in fact the total errors (61 vs 50) are slightly higher and the test accuracy (0.9765 vs 0.9808) slightly lower than the base model's.
Performance of Model 2
- computation speed : 35 second per epoch
- Training Accuracy : 0.9718, Loss: 0.0839
- Validation Accuracy : 0.9776, Loss: 0.0746
- False Positive : 30
- False Negative : 31
- Total errors : 61 out of 2600
- Results : Does not improve on the base model (test accuracy 0.9765 vs 0.9808).
Training and validation accuracy move hand in hand toward the final epochs, so the model consolidates nicely without obvious overfitting.
Think about it:
- Can the model performance be improved if we change our activation function to LeakyReLU?
- Can BatchNormalization improve our model?
Let us try to build a model using BatchNormalization and LeakyReLU as our activation function.
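Before building Model 3, a quick sketch may help show what BatchNormalization actually does. The function below is a simplified, NumPy-only illustration of the training-time computation; the real Keras layer also learns gamma and beta and tracks running statistics for inference, and the numbers here are illustrative:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of activations to zero mean / unit variance per
    feature, then apply a scale (gamma) and shift (beta). A simplified
    sketch of keras.layers.BatchNormalization at training time."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))  # shifted, spread-out activations
y = batch_norm(x)
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))  # centered and rescaled
```

Normalizing each layer's inputs this way tends to stabilize and speed up training, which is why it is interleaved with the convolutions in the model below.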
Model 3 with Batch Normalization and a Squeeze-and-Excite Block
# SE Block Definition
def squeeze_excite_block(input_tensor, ratio=8):
'''
Squeeze-and-Excite block: globally average-pools each channel (squeeze),
learns per-channel gates through a small bottleneck (excite), and rescales
the input channels accordingly - added to further advance the model.
'''
channels = input_tensor.shape[-1]
se = layers.GlobalAveragePooling2D()(input_tensor)
se = layers.Reshape((1, 1, channels))(se)
se = layers.Dense(channels // ratio, activation='relu')(se)
se = layers.Dense(channels, activation='sigmoid')(se)
se = layers.Multiply()([input_tensor, se])
return se
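To make the block's data flow concrete, here is a NumPy-only sketch of the same squeeze-excite computation on a single feature map. The weights `w1`/`w2` are random stand-ins for the two Dense layers' kernels, and the shapes are illustrative:

```python
import numpy as np

def squeeze_excite_np(feature_map, w1, w2):
    """NumPy sketch of squeeze-and-excite: pool each channel (squeeze),
    pass through a small bottleneck MLP with a sigmoid output (excite),
    then rescale the channels of the original feature map."""
    s = feature_map.mean(axis=(0, 1))           # squeeze: (H, W, C) -> (C,)
    h = np.maximum(0.0, s @ w1)                 # bottleneck with ReLU
    gate = 1.0 / (1.0 + np.exp(-(h @ w2)))      # per-channel gate in (0, 1)
    return feature_map * gate                   # rescale each channel

rng = np.random.default_rng(1)
fmap = rng.normal(size=(16, 16, 8))             # C=8, ratio=4 -> bottleneck of 2
w1 = rng.normal(size=(8, 2))
w2 = rng.normal(size=(2, 8))
out = squeeze_excite_np(fmap, w1, w2)
print(out.shape)                                # same shape as the input
```

Because the sigmoid gate lies in (0, 1), the block can only attenuate channels, never amplify them - it learns which channels to emphasize relative to the others.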
# Enhanced CNN Model with Batch Normalization, Dropout, and additional layers
def create_advanced_cnn2_model(num_classes=2):
input_layer = tf.keras.layers.Input(shape=(64, 64, 3))
# Convolutional Layer 1 with LeakyReLU and Batch Normalization
x = layers.Conv2D(32, (3, 3), activation='linear', padding='same')(input_layer)
x = layers.BatchNormalization()(x)
x = layers.LeakyReLU(alpha=0.1)(x) # LeakyReLU activation
x = layers.MaxPooling2D((2, 2))(x)
# Convolutional Layer 2 with Swish and Batch Normalization
x = layers.Conv2D(64, (3, 3), activation='linear', padding='same')(x)
x = layers.BatchNormalization()(x)
x = layers.Activation(tf.keras.activations.swish)(x) # Swish activation
x = layers.MaxPooling2D((2, 2))(x)
# Convolutional Layer 3 with LeakyReLU and Batch Normalization
x = layers.Conv2D(128, (3, 3), activation='linear', padding='same')(x)
x = layers.BatchNormalization()(x)
x = layers.LeakyReLU(alpha=0.1)(x)
# Convolutional Layer 4 (Added) with Batch Normalization and LeakyReLU
x = layers.Conv2D(128, (3, 3), activation='linear', padding='same')(x)
x = layers.BatchNormalization()(x)
x = layers.LeakyReLU(alpha=0.1)(x)
# Apply Squeeze and Excitation block here (directly to the output tensor)
x = squeeze_excite_block(x, ratio=8)
# Global Average Pooling
x = layers.GlobalAveragePooling2D()(x)
# Fully Connected Layer with Dropout
x = layers.Dense(128, activation='relu')(x)
x = layers.Dropout(0.5)(x) # Dropout to reduce overfitting
# Additional Fully Connected Layer (Added)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dropout(0.5)(x)
# Output Layer (softmax for classification)
output_layer = layers.Dense(num_classes, activation='softmax')(x)
# Define the model
model = models.Model(inputs=input_layer, outputs=output_layer)
return model
Building the Model
# Create this advanced model with the squeeze-and-excite block
cnn_adv2_model = create_advanced_cnn2_model(num_classes=2)
Compiling the model
# Compile the model with Adam optimizer and categorical crossentropy loss
cnn_adv2_model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
# Print the summary of the model
cnn_adv2_model.summary()
Model: "functional_2"
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃ ┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩ │ input_layer_2 │ (None, 64, 64, 3) │ 0 │ - │ │ (InputLayer) │ │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ conv2d_6 (Conv2D) │ (None, 64, 64, │ 896 │ input_layer_2[0]… │ │ │ 32) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ batch_normalization │ (None, 64, 64, │ 128 │ conv2d_6[0][0] │ │ (BatchNormalizatio… │ 32) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ leaky_re_lu_2 │ (None, 64, 64, │ 0 │ batch_normalizat… │ │ (LeakyReLU) │ 32) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ max_pooling2d_4 │ (None, 32, 32, │ 0 │ leaky_re_lu_2[0]… │ │ (MaxPooling2D) │ 32) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ conv2d_7 (Conv2D) │ (None, 32, 32, │ 18,496 │ max_pooling2d_4[… │ │ │ 64) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ batch_normalizatio… │ (None, 32, 32, │ 256 │ conv2d_7[0][0] │ │ (BatchNormalizatio… │ 64) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ activation_1 │ (None, 32, 32, │ 0 │ batch_normalizat… │ │ (Activation) │ 64) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ max_pooling2d_5 │ (None, 16, 16, │ 0 │ activation_1[0][… │ │ (MaxPooling2D) │ 64) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ conv2d_8 (Conv2D) │ (None, 16, 16, │ 73,856 │ max_pooling2d_5[… │ │ │ 128) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ batch_normalizatio… │ (None, 16, 16, │ 512 │ conv2d_8[0][0] │ │ (BatchNormalizatio… │ 128) │ │ │ 
├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ leaky_re_lu_3 │ (None, 16, 16, │ 0 │ batch_normalizat… │ │ (LeakyReLU) │ 128) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ conv2d_9 (Conv2D) │ (None, 16, 16, │ 147,584 │ leaky_re_lu_3[0]… │ │ │ 128) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ batch_normalizatio… │ (None, 16, 16, │ 512 │ conv2d_9[0][0] │ │ (BatchNormalizatio… │ 128) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ leaky_re_lu_4 │ (None, 16, 16, │ 0 │ batch_normalizat… │ │ (LeakyReLU) │ 128) │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ global_average_poo… │ (None, 128) │ 0 │ leaky_re_lu_4[0]… │ │ (GlobalAveragePool… │ │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ reshape (Reshape) │ (None, 1, 1, 128) │ 0 │ global_average_p… │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_4 (Dense) │ (None, 1, 1, 16) │ 2,064 │ reshape[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_5 (Dense) │ (None, 1, 1, 128) │ 2,176 │ dense_4[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ multiply (Multiply) │ (None, 16, 16, │ 0 │ leaky_re_lu_4[0]… │ │ │ 128) │ │ dense_5[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ global_average_poo… │ (None, 128) │ 0 │ multiply[0][0] │ │ (GlobalAveragePool… │ │ │ │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_6 (Dense) │ (None, 128) │ 16,512 │ global_average_p… │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dropout_1 (Dropout) │ (None, 128) │ 0 │ dense_6[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_7 (Dense) │ (None, 64) │ 8,256 │ 
dropout_1[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dropout_2 (Dropout) │ (None, 64) │ 0 │ dense_7[0][0] │ ├─────────────────────┼───────────────────┼────────────┼───────────────────┤ │ dense_8 (Dense) │ (None, 2) │ 130 │ dropout_2[0][0] │ └─────────────────────┴───────────────────┴────────────┴───────────────────┘
Total params: 271,378 (1.04 MB)
Trainable params: 270,674 (1.03 MB)
Non-trainable params: 704 (2.75 KB)
Using callbacks
# will use the same call backs
Fit and train the model
# fit the model for the training dataset
cnn_adv2_model_hist = cnn_adv2_model.fit(train_ds_one_hot, batch_size=1000, epochs=20, validation_data= test_ds_one_hot, callbacks=[early_stopping,model_checkpoint] )
cnn_adv2_model_hist.history
Epoch 1/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 85s 336ms/step - accuracy: 0.7733 - loss: 0.4479 - val_accuracy: 0.5088 - val_loss: 3.2003 Epoch 2/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 76s 303ms/step - accuracy: 0.9658 - loss: 0.1096 - val_accuracy: 0.8981 - val_loss: 0.2321 Epoch 3/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 76s 304ms/step - accuracy: 0.9743 - loss: 0.0857 - val_accuracy: 0.9269 - val_loss: 0.1883 Epoch 4/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 76s 305ms/step - accuracy: 0.9731 - loss: 0.0851 - val_accuracy: 0.9658 - val_loss: 0.1199 Epoch 5/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 77s 306ms/step - accuracy: 0.9772 - loss: 0.0717 - val_accuracy: 0.9773 - val_loss: 0.0868 Epoch 6/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 79s 314ms/step - accuracy: 0.9793 - loss: 0.0671 - val_accuracy: 0.9708 - val_loss: 0.1433 Epoch 7/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 78s 311ms/step - accuracy: 0.9791 - loss: 0.0667 - val_accuracy: 0.9769 - val_loss: 0.0764 Epoch 8/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 78s 313ms/step - accuracy: 0.9799 - loss: 0.0612 - val_accuracy: 0.9019 - val_loss: 0.2587 Epoch 9/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 80s 319ms/step - accuracy: 0.9798 - loss: 0.0592 - val_accuracy: 0.9762 - val_loss: 0.0868 Epoch 10/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 79s 315ms/step - accuracy: 0.9797 - loss: 0.0576 - val_accuracy: 0.9838 - val_loss: 0.0830 Epoch 11/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 80s 319ms/step - accuracy: 0.9785 - loss: 0.0586 - val_accuracy: 0.9792 - val_loss: 0.0744 Epoch 12/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 83s 330ms/step - accuracy: 0.9808 - loss: 0.0555 - val_accuracy: 0.9815 - val_loss: 0.0747 Epoch 13/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 79s 316ms/step - accuracy: 0.9795 - loss: 0.0575 - val_accuracy: 0.9727 - val_loss: 0.1008 Epoch 14/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 78s 311ms/step - accuracy: 0.9831 - loss: 0.0515 - val_accuracy: 0.9815 - val_loss: 0.0866 Epoch 15/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 81s 322ms/step - accuracy: 0.9825 - loss: 0.0502 - val_accuracy: 0.9812 - val_loss: 0.0985 Epoch 16/20 250/250 
━━━━━━━━━━━━━━━━━━━━ 79s 314ms/step - accuracy: 0.9816 - loss: 0.0486 - val_accuracy: 0.9846 - val_loss: 0.0506 Epoch 17/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 91s 365ms/step - accuracy: 0.9823 - loss: 0.0511 - val_accuracy: 0.9819 - val_loss: 0.0602 Epoch 18/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 82s 328ms/step - accuracy: 0.9826 - loss: 0.0479 - val_accuracy: 0.9827 - val_loss: 0.0745 Epoch 19/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 81s 325ms/step - accuracy: 0.9836 - loss: 0.0443 - val_accuracy: 0.9792 - val_loss: 0.1054 Epoch 20/20 250/250 ━━━━━━━━━━━━━━━━━━━━ 81s 322ms/step - accuracy: 0.9812 - loss: 0.0484 - val_accuracy: 0.9835 - val_loss: 0.0683
{'accuracy': [0.8777145743370056,
0.9695488214492798,
0.9747976660728455,
0.976159930229187,
0.9784838557243347,
0.9804471731185913,
0.9803269505500793,
0.9804471731185913,
0.9805272817611694,
0.9812484979629517,
0.9802868962287903,
0.9815690517425537,
0.9811282753944397,
0.9831316471099854,
0.9829713702201843,
0.9834121465682983,
0.9827710390090942,
0.9826909303665161,
0.9830915927886963,
0.9825306534767151],
'loss': [0.2831425070762634,
0.09857438504695892,
0.08275604248046875,
0.07653523981571198,
0.0657389834523201,
0.06006051227450371,
0.05978575348854065,
0.058588072657585144,
0.055159348994493484,
0.05165781080722809,
0.053375452756881714,
0.051349617540836334,
0.050737105309963226,
0.04783205687999725,
0.047604579478502274,
0.04524710401892662,
0.04830053448677063,
0.0458974726498127,
0.044894274324178696,
0.04496600851416588],
'val_accuracy': [0.5088461637496948,
0.8980769515037537,
0.9269230961799622,
0.9657692313194275,
0.9773076772689819,
0.9707692265510559,
0.9769230484962463,
0.9019230604171753,
0.9761538505554199,
0.983846127986908,
0.9792307615280151,
0.9815384745597839,
0.9726923108100891,
0.9815384745597839,
0.9811538457870483,
0.9846153855323792,
0.9819231033325195,
0.982692301273346,
0.9792307615280151,
0.9834615588188171],
'val_loss': [3.2003440856933594,
0.23214231431484222,
0.188303604722023,
0.11992118507623672,
0.0867530032992363,
0.1432863026857376,
0.0763697549700737,
0.25872141122817993,
0.08675388246774673,
0.08298096060752869,
0.07438519597053528,
0.0746896043419838,
0.10084319859743118,
0.08660412579774857,
0.09847318381071091,
0.05055966228246689,
0.06024990975856781,
0.07449560612440109,
0.10541775077581406,
0.0682770311832428]}
Plotting the train and validation accuracy
# cache the model in the model list for the final result display, and evaluate accuracy on the training and test datasets.
e_train, e_test = evaluate_and_print_accuracy(cnn_adv2_model)
model_list.append( ("Model 3: cnn_adv2_model",e_train,e_test) )
len(model_list)
show_models_so_far()
250/250 ━━━━━━━━━━━━━━━━━━━━ 21s 82ms/step - accuracy: 0.9812 - loss: 0.0623 26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 82ms/step - accuracy: 0.9856 - loss: 0.0451 Training Accuracy: 0.982650876045227, Loss: 0.05575656518340111 Validation Accuracy: 0.9846153855323792, Loss: 0.05055966228246689 ('Model 1: cnn_base_model', [0.04104594141244888, 0.985575795173645], [0.05401621386408806, 0.9807692170143127]) Model 1: cnn_base_model: test-accuracy:0.9807692170143127 Model 2: cnn_model_2: test-accuracy:0.9765384793281555 Model 3: cnn_adv2_model: test-accuracy:0.9846153855323792 The best model: Model 3: cnn_adv2_model, with test-accuracy: 0.9846153855323792
# Get true and predicted labels from the test dataset
y_true, y_pred = get_true_and_pred_labels(test_ds_one_hot, cnn_adv2_model )
# Compute confusion matrix
cm = confusion_matrix(y_true, y_pred)
# plot the confusion matrix for model 3
plot_confusion_matrix( cm )
Evaluating the model
plot_training_and_validation_graph( cnn_adv2_model_hist )
Generate the classification report and confusion matrix
Observations and insights:
Observations: Model 3 : A model with squeeze and excite layer
This model definitely performs better than the previous two models - especially in the absolute number of errors (false negatives + false positives). Certain epochs break the improving trend in accuracy and loss, but overall the model flattens out fine.
The only drawback is training time on this dataset - the time per step is noticeably higher than Model 2's. That can be mitigated by moving to higher-compute hardware if it is cheap; otherwise it is a genuine concern.
Performance of Model 3 with squeeze and excite layer
- computation speed : 85s per epoch
- Training Accuracy : 0.9812, Loss: 0.0623
- Validation Accuracy : 0.9856, Loss: 0.0451
- False Positive : 16
- False Negative : 24
- Total errors : 40 out of 2600
- Results : So far the best model - in terms of both validation accuracy and the confusion matrix. Training is slower than Model 2, but the absolute error counts are clearly better.
Think About It :
- Can we improve the model with Image Data Augmentation?
Model 4: Data Augmentation Layer
# nothing here
Use image data generator
# The images are stored in a TensorFlow MapDataset, which combines images and labels into batches,
# so ImageDataGenerator cannot be used directly. Instead, the function below applies image
# augmentation by randomly morphing each image with a few standard transformations. It will be
# used as a mapping function over the original dataset, and the augmented images will then be
# used for training.
# Define a function for augmentations (this will act like the "map" function)
def data_augmentation(image, label):
# Apply random transformations here
image = tf.image.random_flip_left_right(image)
image = tf.image.random_flip_up_down(image)
image = tf.image.random_brightness(image, max_delta=0.2)
image = tf.image.random_contrast(image, lower=0.7, upper=1.3)
return image, label
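The same transformations can be sketched without TensorFlow to see what they do to pixel values. The NumPy version below mirrors the flips and the brightness shift (contrast is omitted for brevity); the [0, 1] pixel range is an assumption about how the images were normalized:

```python
import numpy as np

def augment_np(image, rng):
    """NumPy sketch of the random transforms above: horizontal/vertical
    flips plus a brightness shift. The delta range mirrors
    tf.image.random_brightness(max_delta=0.2)."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]          # flip left-right
    if rng.random() < 0.5:
        image = image[::-1, :, :]          # flip up-down
    delta = rng.uniform(-0.2, 0.2)         # brightness shift
    return np.clip(image + delta, 0.0, 1.0)

rng = np.random.default_rng(42)
img = rng.random((64, 64, 3))              # a dummy normalized image
aug = augment_np(img, rng)
print(aug.shape)                           # shape and pixel range are preserved
```

Note that flips and brightness shifts preserve the image shape and (after clipping) the pixel range, so the augmented dataset can be fed to the model exactly like the original one.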
# let's map the training dataset through the image augmentation;
# this will be used for model fitting - no need to augment the test dataset.
ds_augmented = train_ds.map(data_augmentation)
Think About It :
- Check if the performance of the model can be improved by changing different parameters in the ImageDataGenerator.
Visualizing Augmented images
# show a 3x3 grid of augmented images - to check whether they look reasonable and remain distinguishable
show_3_by_3_image_of_dataset(ds_augmented)
# show 6x6 images of augmented images.
show_6_by_6_image_of_dataset(ds_augmented)
Observations and insights:
The augmentation is clearly working - the images look noticeably different from the original RGB images. Let's feed the augmented images into a model and see how it performs; the real test is the model's accuracy and error rate, which will tell us more.
Building the Model with Image Augmentation¶
# Improved Adv Model with data augmentation layer
def create_advanced3_cnn_model(num_classes=2):
'''
A copy of the Model 2 architecture - reused here because its performance is
on par with Model 2's; only the (augmented) input data differs.
'''
model = models.Sequential([
# input size
tf.keras.layers.Input(shape=(64, 64, 3)),
# Layer 1 with LeakyReLU and max pooling of 2x2
layers.Conv2D(32, (3, 3), activation='linear', padding='same'),
layers.LeakyReLU(alpha=0.1), # LeakyReLU activation
layers.MaxPooling2D((2, 2)),
# Layer 2 with Swish activation and max pooling 2x2
layers.Conv2D(64, (3, 3), activation='linear', padding='same'),
layers.Activation(tf.keras.activations.swish),
layers.MaxPooling2D((2, 2)),
# Layer 3 with LeakyReLU and alpha 0.1
layers.Conv2D(128, (3, 3), activation='linear', padding='same'),
layers.LeakyReLU(alpha=0.1),
# Adding a global average pooling
layers.GlobalAveragePooling2D(),
# Add a dense layer
layers.Dense(128, activation='relu'),
# Add a dropout layer
layers.Dropout(0.5), # Dropout to reduce overfitting
# Final layer with softmax activation
layers.Dense(num_classes, activation='softmax')
])
return model
Using Callbacks
# not going to use any callbacks to stop the model early - let's run the full 20 epochs
Fit and Train the model
# Map the data augmentation function onto the dataset
train_ds_augmented = train_ds_one_hot.map(data_augmentation)
# Create the model - same architecture as model 2
cnn_adv3_model = create_advanced3_cnn_model(num_classes=2)
# Compile the model
cnn_adv3_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Fit the model using the augmented data; only the training data is augmented -
# for validation we use the regular test data.
cnn3_augmented_model_history = cnn_adv3_model.fit(train_ds_augmented, epochs=20,batch_size=1000, validation_data= test_ds_one_hot )
Epoch 1/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 42s 166ms/step - accuracy: 0.5744 - loss: 0.6728 - val_accuracy: 0.7215 - val_loss: 0.5997
Epoch 2/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 41s 166ms/step - accuracy: 0.7953 - loss: 0.4779 - val_accuracy: 0.9065 - val_loss: 0.1990
Epoch 3/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 41s 165ms/step - accuracy: 0.9005 - loss: 0.2663 - val_accuracy: 0.9092 - val_loss: 0.2358
Epoch 4/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 40s 159ms/step - accuracy: 0.9287 - loss: 0.2037 - val_accuracy: 0.9435 - val_loss: 0.1529
Epoch 5/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 41s 165ms/step - accuracy: 0.9441 - loss: 0.1633 - val_accuracy: 0.9469 - val_loss: 0.1480
Epoch 6/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 41s 162ms/step - accuracy: 0.9439 - loss: 0.1616 - val_accuracy: 0.9500 - val_loss: 0.1348
Epoch 7/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 43s 172ms/step - accuracy: 0.8970 - loss: 0.2699 - val_accuracy: 0.9473 - val_loss: 0.1408
Epoch 8/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 40s 160ms/step - accuracy: 0.9487 - loss: 0.1470 - val_accuracy: 0.9504 - val_loss: 0.1229
Epoch 9/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 39s 154ms/step - accuracy: 0.9481 - loss: 0.1400 - val_accuracy: 0.9635 - val_loss: 0.1107
Epoch 10/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 36s 145ms/step - accuracy: 0.9584 - loss: 0.1176 - val_accuracy: 0.9581 - val_loss: 0.1208
Epoch 11/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 37s 146ms/step - accuracy: 0.9556 - loss: 0.1233 - val_accuracy: 0.9627 - val_loss: 0.1089
Epoch 12/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 38s 149ms/step - accuracy: 0.9588 - loss: 0.1142 - val_accuracy: 0.9692 - val_loss: 0.0981
Epoch 13/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 38s 152ms/step - accuracy: 0.9637 - loss: 0.1045 - val_accuracy: 0.9650 - val_loss: 0.0974
Epoch 14/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 38s 151ms/step - accuracy: 0.9636 - loss: 0.1053 - val_accuracy: 0.9685 - val_loss: 0.0918
Epoch 15/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 38s 153ms/step - accuracy: 0.9655 - loss: 0.0978 - val_accuracy: 0.9696 - val_loss: 0.0935
Epoch 16/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 38s 151ms/step - accuracy: 0.9653 - loss: 0.1065 - val_accuracy: 0.9692 - val_loss: 0.1171
Epoch 17/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 40s 159ms/step - accuracy: 0.9598 - loss: 0.1274 - val_accuracy: 0.9781 - val_loss: 0.0766
Epoch 18/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 37s 148ms/step - accuracy: 0.9716 - loss: 0.0823 - val_accuracy: 0.9765 - val_loss: 0.0752
Epoch 19/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 37s 146ms/step - accuracy: 0.9738 - loss: 0.0815 - val_accuracy: 0.9746 - val_loss: 0.0803
Epoch 20/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 37s 146ms/step - accuracy: 0.9733 - loss: 0.0814 - val_accuracy: 0.9785 - val_loss: 0.0644
# print the model summary for the augmented model - it is based on Model 2; only the input is different
cnn_adv3_model.summary()
Model: "sequential_2"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_10 (Conv2D)              │ (None, 64, 64, 32)     │           896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_5 (LeakyReLU)       │ (None, 64, 64, 32)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_6 (MaxPooling2D)  │ (None, 32, 32, 32)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_11 (Conv2D)              │ (None, 32, 32, 64)     │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ activation_2 (Activation)       │ (None, 32, 32, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_7 (MaxPooling2D)  │ (None, 16, 16, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_12 (Conv2D)              │ (None, 16, 16, 128)    │        73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_6 (LeakyReLU)       │ (None, 16, 16, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_average_pooling2d_3      │ (None, 128)            │             0 │
│ (GlobalAveragePooling2D)        │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense)                 │ (None, 128)            │        16,512 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_10 (Dense)                │ (None, 2)              │           258 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 330,056 (1.26 MB)
Trainable params: 110,018 (429.76 KB)
Non-trainable params: 0 (0.00 B)
Optimizer params: 220,038 (859.53 KB)
Evaluating the model
Plot the train and validation accuracy
e_train, e_test = evaluate_and_print_accuracy( cnn_adv3_model )
model_list.append( ("Model 4: cnn_adv3_model (image aug)",e_train,e_test) )
len(model_list)
show_models_so_far()
250/250 ━━━━━━━━━━━━━━━━━━━━ 11s 43ms/step - accuracy: 0.9770 - loss: 0.0658
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 44ms/step - accuracy: 0.9767 - loss: 0.0740
Training Accuracy: 0.9776424169540405, Loss: 0.06338279694318771
Validation Accuracy: 0.9784615635871887, Loss: 0.06436709314584732
('Model 1: cnn_base_model', [0.04104594141244888, 0.985575795173645], [0.05401621386408806, 0.9807692170143127])
Model 1: cnn_base_model: test-accuracy:0.9807692170143127
Model 2: cnn_model_2: test-accuracy:0.9765384793281555
Model 3: cnn_adv2_model: test-accuracy:0.9846153855323792
Model 4: cnn_adv3_model (image aug): test-accuracy:0.9784615635871887
The best model: Model 3: cnn_adv2_model, with test-accuracy: 0.9846153855323792
Plotting the classification report and confusion matrix
# Get true and predicted labels from the test dataset for the augmented model
y_true, y_pred = get_true_and_pred_labels(test_ds_one_hot, cnn_adv3_model)
# Compute confusion matrix
cm = confusion_matrix(y_true, y_pred)
# plot a chart for the confusion matrix
plot_confusion_matrix( cm )
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 21ms/step
Observations: Model 4: A model with image augmentation
The presentation of augmented images in the previous chart was really interesting, but if we look at the errors, the false positive/negative counts have gone up. The same architecture performed well on regular RGB images, yet here it shows a performance degradation on average too. It was a worthwhile experiment, but on error rate it is clearly not a contender.
Performance of Model 4 (image augmentation)
- computation speed : 40s per epoch
- Training Accuracy : 0.9770, Loss: 0.0658
- Validation Accuracy : 0.9767, Loss: 0.0740
- False Positive : 37
- False Negative : 19
- Total errors : 56 out of 2600
- Results : Cannot beat Model 2's and Model 3's accuracy. The confusion-matrix error count is a little higher and the accuracy is lower - image augmentation is not worth the effort for this base architecture, which works better without it.
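The FP/FN counts above can be read straight off a 2x2 confusion matrix. As a minimal sketch of the bookkeeping (pure NumPy with toy label arrays; the notebook itself uses `get_true_and_pred_labels` and sklearn's `confusion_matrix`):

```python
import numpy as np

def binary_confusion(y_true, y_pred):
    """Return (tn, fp, fn, tp) counts for 0/1 label arrays."""
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    return tn, fp, fn, tp

# toy example: 8 samples, one false positive and one false negative
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 1, 0, 1, 1, 1])
tn, fp, fn, tp = binary_confusion(y_true, y_pred)
total_errors = fp + fn  # the "total errors" figure reported for each model
```

The "Total errors : 56 out of 2600" line above is exactly `fp + fn` over the test set.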
Now, let us try to use a pretrained model like VGG16 and check how it performs on our data.
Model 5 : Pre-trained VGG16 model
Pre-trained model (VGG16)¶
- Import the VGG16 network up to any layer you choose
- Add Fully Connected Layers on top of it
from tensorflow.keras.applications import VGG16
# creates VGG16 model with pre-trained weights from ImageNet, excluding the top classification layers
def create_vgg16_model(num_classes=2):
    base_model = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))
    # Freeze the layers in the base VGG16 model to avoid retraining them
    for layer in base_model.layers:
        layer.trainable = False
    model = models.Sequential([
        base_model,
        # global average pooling over the VGG16 feature maps
        layers.GlobalAveragePooling2D(),
        # fully connected layer with ReLU activation
        layers.Dense(128, activation='relu'),
        # dropout to reduce overfitting
        layers.Dropout(0.5),
        # final classification layer
        layers.Dense(num_classes, activation='softmax')
    ])
    return model
Compiling the model
# 2. Create the VGG16 model
cnn_vgg16_model = create_vgg16_model(num_classes=2)
# 3. Compile the model
cnn_vgg16_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
using callbacks
# using the same callbacks.
cnn_vgg16_model.summary()
Model: "sequential_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ vgg16 (Functional)              │ (None, 2, 2, 512)      │    14,714,688 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_average_pooling2d_4      │ (None, 512)            │             0 │
│ (GlobalAveragePooling2D)        │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_11 (Dense)                │ (None, 128)            │        65,664 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_4 (Dropout)             │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_12 (Dense)                │ (None, 2)              │           258 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 14,780,610 (56.38 MB)
Trainable params: 65,922 (257.51 KB)
Non-trainable params: 14,714,688 (56.13 MB)
Fit and Train the model
# 4. Fit the model using normalized train and test data for validation
vgg16_model_history = cnn_vgg16_model.fit(
    train_ds_one_hot,
    epochs=20,
    batch_size=1000,
    validation_data=test_ds_one_hot
)
Epoch 1/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 345s 1s/step - accuracy: 0.8065 - loss: 0.4037 - val_accuracy: 0.9235 - val_loss: 0.1975
Epoch 2/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 1481s 6s/step - accuracy: 0.9223 - loss: 0.2018 - val_accuracy: 0.9331 - val_loss: 0.1666
Epoch 3/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 1587s 6s/step - accuracy: 0.9308 - loss: 0.1799 - val_accuracy: 0.9354 - val_loss: 0.1706
Epoch 4/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 219s 877ms/step - accuracy: 0.9344 - loss: 0.1694 - val_accuracy: 0.9415 - val_loss: 0.1458
Epoch 5/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 677s 3s/step - accuracy: 0.9392 - loss: 0.1619 - val_accuracy: 0.9385 - val_loss: 0.1460
Epoch 6/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 221s 883ms/step - accuracy: 0.9418 - loss: 0.1554 - val_accuracy: 0.9431 - val_loss: 0.1422
Epoch 7/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 228s 911ms/step - accuracy: 0.9431 - loss: 0.1549 - val_accuracy: 0.9327 - val_loss: 0.1594
Epoch 8/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 231s 922ms/step - accuracy: 0.9430 - loss: 0.1515 - val_accuracy: 0.9477 - val_loss: 0.1395
Epoch 9/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 233s 931ms/step - accuracy: 0.9444 - loss: 0.1518 - val_accuracy: 0.9412 - val_loss: 0.1432
Epoch 10/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 234s 936ms/step - accuracy: 0.9458 - loss: 0.1442 - val_accuracy: 0.9458 - val_loss: 0.1356
Epoch 11/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 236s 943ms/step - accuracy: 0.9474 - loss: 0.1459 - val_accuracy: 0.9477 - val_loss: 0.1317
Epoch 12/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 236s 945ms/step - accuracy: 0.9479 - loss: 0.1433 - val_accuracy: 0.9469 - val_loss: 0.1365
Epoch 13/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 237s 947ms/step - accuracy: 0.9477 - loss: 0.1417 - val_accuracy: 0.9485 - val_loss: 0.1323
Epoch 14/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 521s 2s/step - accuracy: 0.9485 - loss: 0.1420 - val_accuracy: 0.9504 - val_loss: 0.1283
Epoch 15/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 324s 1s/step - accuracy: 0.9486 - loss: 0.1392 - val_accuracy: 0.9485 - val_loss: 0.1313
Epoch 16/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 180s 719ms/step - accuracy: 0.9492 - loss: 0.1383 - val_accuracy: 0.9496 - val_loss: 0.1289
Epoch 17/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 183s 732ms/step - accuracy: 0.9499 - loss: 0.1367 - val_accuracy: 0.9504 - val_loss: 0.1286
Epoch 18/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 185s 742ms/step - accuracy: 0.9503 - loss: 0.1347 - val_accuracy: 0.9465 - val_loss: 0.1308
Epoch 19/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 960s 4s/step - accuracy: 0.9484 - loss: 0.1365 - val_accuracy: 0.9392 - val_loss: 0.1392
Epoch 20/20
250/250 ━━━━━━━━━━━━━━━━━━━━ 213s 851ms/step - accuracy: 0.9504 - loss: 0.1355 - val_accuracy: 0.9523 - val_loss: 0.1309
Plot the train and validation accuracy
plot_training_and_validation_graph(vgg16_model_history)
Observations and insights:¶
- What can be observed from the validation and train curves?
As the epochs progress, we see the training accuracy converge toward the val_accuracy. However, the gap between loss and val_loss never fully closes, which suggests more errors will show up in the confusion matrix. This is the worst performance of all the models, even though it uses the ImageNet weights; with randomly initialized weights the performance drops even further.
Evaluating the model
# evaluation of model 5 - VGG16 - and its true-label comparisons
e_train,e_test = evaluate_and_print_accuracy( cnn_vgg16_model )
model_list.append( ("Model 5: cnn_vgg16_model",e_train,e_test) )
len(model_list)
show_models_so_far()
250/250 ━━━━━━━━━━━━━━━━━━━━ 206s 825ms/step - accuracy: 0.9516 - loss: 0.1313
26/26 ━━━━━━━━━━━━━━━━━━━━ 22s 827ms/step - accuracy: 0.9580 - loss: 0.1239
Training Accuracy: 0.9518390893936157, Loss: 0.12964138388633728
Validation Accuracy: 0.9523077011108398, Loss: 0.13092462718486786
('Model 1: cnn_base_model', [0.04104594141244888, 0.985575795173645], [0.05401621386408806, 0.9807692170143127])
Model 1: cnn_base_model: test-accuracy:0.9807692170143127
Model 2: cnn_model_2: test-accuracy:0.9765384793281555
Model 3: cnn_adv2_model: test-accuracy:0.9846153855323792
Model 4: cnn_adv3_model (image aug): test-accuracy:0.9784615635871887
Model 5: cnn_vgg16_model: test-accuracy:0.9523077011108398
The best model: Model 3: cnn_adv2_model, with test-accuracy: 0.9846153855323792
Plotting the classification report and confusion matrix
# Get true and predicted labels from the test dataset
y_true, y_pred = get_true_and_pred_labels(test_ds_one_hot, cnn_vgg16_model)
# Compute confusion matrix
cm = confusion_matrix(y_true, y_pred)
plot_confusion_matrix( cm )
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 215ms/step
Observations: Model 5: VGG16 model
I expected the performance to be better. The first time I performed the model fit, training was very slow, averaging about 20 s per step per epoch. After I switched to the ImageNet weights it did much better, at about 964 ms per step.
However, compared to all the other models this one has the worst performance, and the confusion matrix also shows the worst result.
Performance of Pre-trained Model 5 (VGG16)
- computation speed : 300s per epoch
- Training Accuracy : 0.9516, Loss: 0.1313
- Validation Accuracy : 0.9580, Loss: 0.1239
- False Positive : 29
- False Negative : 95
- Total errors : 124 out of 2600
This is the worst-performing model in terms of time. There may be better ways to use this model by fine-tuning the weights or hyperparameters. In the training-vs-validation accuracy graph the final results are way off, and you can see the impact in the confusion matrix as well as in the error rate.
The time this model takes to fit the training set is exorbitant - and keep in mind that with custom weights the time is even worse. Notice in the training-vs-validation chart that the curves diverge significantly in the final epochs, and you can see the gap opening up.
Think about it:¶
- What observations and insights can be drawn from the confusion matrix and classification report?
- Choose the model with the best accuracy scores from all the above models and save it as a final model.
print("The final comparison of the total 5 models so far:")
show_models_so_far()
The final comparison of the total 5 models so far:
('Model 1: cnn_base_model', [0.04104594141244888, 0.985575795173645], [0.05401621386408806, 0.9807692170143127])
Model 1: cnn_base_model: test-accuracy:0.9807692170143127
Model 2: cnn_model_2: test-accuracy:0.9765384793281555
Model 3: cnn_adv2_model: test-accuracy:0.9846153855323792
Model 4: cnn_adv3_model (image aug): test-accuracy:0.9784615635871887
Model 5: cnn_vgg16_model: test-accuracy:0.9523077011108398
The best model: Model 3: cnn_adv2_model, with test-accuracy: 0.9846153855323792
Observations and Conclusions drawn from the final model:¶
The final comparison of the total 5 models so far:
- Model 1: cnn_base_model: test-accuracy:0.9807692170143127
- Model 2: cnn_model_2: test-accuracy:0.9765384793281555
- Model 3: cnn_adv2_model: test-accuracy:0.9846153855323792
- Model 4: cnn_adv3_model (image aug): test-accuracy:0.9784615635871887
- Model 5: cnn_vgg16_model: test-accuracy:0.9523077011108398
The best model: Model 3: cnn_adv2_model, with test-accuracy: 0.9846153855323792¶
In conclusion, Model 3 - with the squeeze-and-excite block - is the best-performing model, and Model 2 comes in second in terms of accuracy. Model 3's confusion matrix shows the lowest false counts, with an accuracy of 0.986 on test validation.
I prefer Model 3 over the rest of the models because its performance is great: it has the best accuracy and among the fewest false positives/negatives in the confusion matrix. The LeakyReLU activation with the squeeze-and-excite block does not burden performance, and at the same time it smooths the training of the model over the epochs.
Model 3 - with squeeze and excite - is my final choice for this project.
Final Model : Model 3 - with Squeeze and Excite block
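For reference, the squeeze-and-excite idea behind Model 3 can be sketched numerically: global-average-pool each channel ("squeeze"), pass the result through a small two-layer bottleneck ("excite"), and rescale every channel of the feature map by the resulting sigmoid gates. This is a minimal NumPy illustration of the mechanism with random weights, not the trained Keras layer:

```python
import numpy as np

def squeeze_excite(feature_map, w1, w2):
    """Apply a squeeze-and-excite step to one (H, W, C) feature map.

    w1: (C, C//r) and w2: (C//r, C) are the bottleneck weights,
    where r is the reduction ratio.
    """
    # squeeze: global average pool over the spatial dimensions -> (C,)
    z = feature_map.mean(axis=(0, 1))
    # excite: bottleneck MLP, ReLU then sigmoid gates in (0, 1)
    s = np.maximum(z @ w1, 0.0) @ w2
    gates = 1.0 / (1.0 + np.exp(-s))
    # scale: reweight every channel by its gate
    return feature_map * gates

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 16, 8))   # one 16x16 map with 8 channels
w1 = rng.normal(size=(8, 2))       # reduction ratio r = 4
w2 = rng.normal(size=(2, 8))
out = squeeze_excite(x, w1, w2)
```

Because the gates lie strictly in (0, 1), the block never changes the tensor shape; it only attenuates channels, which is why it can be dropped into the CNN without burdening the architecture.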
Improvements that can be done:
- Can the model performance be improved using other pre-trained models or different CNN architecture?
- You can try to build a model using these HSV images and compare them with your other models.
Insights¶
Refined insights:¶
- What are the most meaningful insights from the data relevant to the problem?
If there is a clear way to distinguish the patterns in the images, there is a good chance the model will work. The data provided was balanced, so not much data preparation was needed; in real-life problems, however, one has to think about preparing the data.
Applying HSV conversion and blurring did not help in this case: the RGB images were the most distinguishable, so the models trained on them definitely show better results.
Image augmentation did help make subtle differences more noticeable, which means that in a real-life problem one should try different ways to visualize the images. One further trial could be monochrome images, to see whether any additional patterns can be derived.
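As a minimal illustration of this kind of augmentation, simple flips and 90-degree rotations turn one cell image into several label-preserving variants (a NumPy sketch; the notebook's real augmentation pipeline uses Keras preprocessing):

```python
import numpy as np

def augment_variants(image):
    """Return simple label-preserving variants of an (H, W, C) image."""
    return [
        image,                 # original
        np.fliplr(image),      # horizontal flip
        np.flipud(image),      # vertical flip
        np.rot90(image, k=1),  # 90-degree rotation
        np.rot90(image, k=2),  # 180-degree rotation
    ]

img = np.arange(2 * 2 * 1).reshape(2, 2, 1)
vs = augment_variants(img)
```

Since a parasite is visible regardless of cell orientation, these transforms enlarge the training set without changing any labels.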
Comparison of various techniques and their relative performance:¶
- How do different techniques perform? Which one is performing relatively better? Is there scope to improve the performance further?
One thing I noticed is that the base CNN model with LeakyReLU (Model 2) performed very well. I am going with Model 3 even though its speed is a little worse than Model 2's, because their results are otherwise very similar. In terms of problem solving, you may want to stick with the faster model so that multiple iterations can be done without wasting time and effort.
Once that model is fine-tuned, it is better to add only complexity on top that you are confident will not degrade performance.
On Model 3's training-vs-validation accuracy plot, around the 4th epoch there is a pronounced spike in loss; this happened on pretty much every run. I suspect what makes the model better is the extra squeeze-and-excite layers kicking in.
Adding more and more layers does not improve the accuracy by as much as we hoped, while adding to the processing time.
Proposal for the final solution design:¶
- What model do you propose to be adopted? Why is this the best solution to adopt?
In terms of performance on my two machines, I would select Model 3, with squeeze and excite. The reason is its very decent performance numbers on an Apple Mac M1 with 16 GB and a MacBook Air M3 with 16 GB. It has the lowest false positive/negative counts across the board, and the highest accuracy on the test validation set.
It does the job without draining my compute resources or needing a special compute device to train and hyper-tune parameters. In real life, deploying a solution that is easy, adaptive, and incremental is always better: you will not stick with one model forever, you will need improvements, and if a model is heavy to build and fit, deploying it requires special resources that may not justify the cost.
Also, this model has the best accuracy and false negative/positive numbers. So my final proposal is Model 3.
Final Conclusion¶
Executive Summary¶
Classifying red blood cells as infected or uninfected with the malaria parasite can be an exhausting task that requires experience and expertise. This project builds an AI/ML-based solution to classify red blood cell images as malaria-infected or uninfected. The classification uses a multi-layer CNN with a squeeze-and-excite block, interleaved with batch normalization layers. The best model reaches 0.986 accuracy with a loss of 0.050, and produces only 34 false positives/negatives in total out of 2600 test images. This would reduce the effort of detecting malaria parasite infection in red blood cell images and provide an optimized way to locate regions of infection.
Deploying this model does not require any special hardware; a good Mac/PC is enough to deploy it on bare metal as well as in a cloud environment.
Problem Recap¶
Malaria is a contagious disease caused by Plasmodium parasites that are transmitted to humans through the bites of infected female Anopheles mosquitoes. Once the parasite enters the blood, it begins damaging red blood cells (RBCs). A traditional lab diagnosis of malaria requires careful inspection by an experienced professional to discriminate between healthy and infected RBCs. The task is tedious and time-consuming, and diagnostic accuracy depends on the skill of the person performing it.
An automated AI/ML-based system can help with early and accurate detection of malaria; an AI-based application can achieve a much higher accuracy rate than manual classification. Once trained and tested on the labelled data set, the model can take an RBC image and classify whether or not it is infected with the parasite.
Solution Summary¶
As noted in the solution above, Model 3 is the best-performing model. It can be extended for further refinement during implementation. The solution can be deployed on any high-end CPU, such as Apple M3 silicon as tested. The input is individual images of infected/uninfected red blood cells; the output is a classification stating whether or not the image appears to be infected by the malaria parasite.
Implementation:¶
- For implementation, the solution can be designed as a layered approach:
- a web-based UI that can take/upload images for processing
- an image-processing server that validates each image and, once it is confirmed,
- stores the unclassified image in a non-relational database with a directory structure that can be consumed as a TensorFlow dataset
- a non-relational database structure that relies on training and testing directories with clear labels
- unclassified images are then processed by the image-processing service and fed into the model
- the model classifies the images and the responses are stored in the database so they can be returned to the user
- once a classified image is confirmed for accuracy by a human, it can be sorted into the test and training sets on a 1:5 basis
- this sorting helps train the model further and improve validation periodically
- once processing is done, the classification results can be returned to the user
- a human aids in monitoring the process, making sure the model has the accuracy we need, and commits each image after manually verifying it
- this verification can be done on a selective basis and slowly become less invasive as the model matures
- training the model may require dedicated compute
- for the non-relational store, MongoDB or a similar database can be used
- TensorFlow image processing can directly map the database directory into its dataset
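The 1:5 test/train routing step above can be sketched in plain Python. All names here (directory layout, labels) are illustrative; the real store would be the database-backed directory structure described above, and the caller would create the directory and move the file:

```python
import os

def route_confirmed_image(filename, label, counter, root="dataset"):
    """Return the destination path for a human-confirmed image.

    Every 6th confirmed image goes to test/, the rest to train/,
    giving the 1:5 test:train split described above. `counter` is
    the running count of confirmed images; `label` is the verified
    class directory (e.g. "parasitized" or "uninfected").
    """
    split = "test" if counter % 6 == 0 else "train"
    return os.path.join(root, split, label, filename)

# counters 1..6: five images routed to train/, one to test/
dests = [route_confirmed_image(f"cell_{i}.png", "parasitized", i)
         for i in range(1, 7)]
```

Because the split is keyed on the running counter, the ratio stays stable as confirmed images trickle in, and the resulting directory tree is exactly what a labelled TensorFlow dataset loader expects.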
Challenges¶
- Image validation can be a challenge and needs to be detailed out.
- Make sure the model only detects cells infected with the malaria parasite.
- What happens if a red blood cell image shows an infection that is not the malaria parasite but a different one?
- What output should be provided when the image is a red blood cell but the infection is not one the model can classify?
- A mature model takes time to achieve; until then, sampling or manual human verification may be needed or warranted.
Enhancements:¶
- Can the model, or the solution as a whole, be enhanced to recognize different parasites?
- Can a test be provided that finds more infections and diseases, if they are or can be classified using red blood cell image classification?